Even better(er) binary quantization #117994

Merged

elasticsearchmachine merged 14 commits into elastic:main from benwtrent:feature/even-better-binary-quantization

Dec 9, 2024

Member

benwtrent commented Dec 4, 2024

This measurably improves BBQ by changing the underlying algorithm to an optimized per-vector scalar quantization.

This is a brand new way to quantize vectors. Instead of there being a global set of upper and lower quantile bands, these are optimized and calculated per individual vector. Additionally, vectors are centered on a common centroid.
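As a rough illustration of the per-vector idea described above, the following sketch centers each vector on a shared centroid and then quantizes it against an interval chosen for that vector alone. It is a simplification under stated assumptions: the interval here is just the min/max of the centered vector, whereas the PR optimizes the interval, and the class and method names are hypothetical rather than taken from the Elasticsearch code.

```java
import java.util.Arrays;

/** Illustrative only: class and method names are hypothetical, not from the PR. */
final class PerVectorQuantizationSketch {

    /** Quantized codes plus the interval that was chosen for this one vector. */
    static final class Quantized {
        final double lower;
        final double upper;
        final int[] codes;

        Quantized(double lower, double upper, int[] codes) {
            this.lower = lower;
            this.upper = upper;
            this.codes = codes;
        }
    }

    /** Component-wise mean of all vectors: the common centroid. */
    static double[] centroid(double[][] vectors) {
        double[] c = new double[vectors[0].length];
        for (double[] v : vectors) {
            for (int i = 0; i < c.length; i++) {
                c[i] += v[i];
            }
        }
        for (int i = 0; i < c.length; i++) {
            c[i] /= vectors.length;
        }
        return c;
    }

    /** Centers the vector, picks a per-vector interval, and maps each component to a code in [0, 2^bits - 1]. */
    static Quantized quantize(double[] vector, double[] centroid, int bits) {
        int levels = (1 << bits) - 1;
        double[] centered = new double[vector.length];
        double lower = Double.POSITIVE_INFINITY;
        double upper = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < vector.length; i++) {
            centered[i] = vector[i] - centroid[i];
            lower = Math.min(lower, centered[i]);
            upper = Math.max(upper, centered[i]);
        }
        // Assumption: min/max stands in for the optimized interval used in the PR.
        double step = (upper - lower) / levels;
        int[] codes = new int[vector.length];
        for (int i = 0; i < vector.length; i++) {
            codes[i] = step == 0.0 ? 0 : (int) Math.round((centered[i] - lower) / step);
        }
        return new Quantized(lower, upper, codes);
    }

    public static void main(String[] args) {
        double[][] vectors = {{1, 2}, {3, 6}};
        double[] c = centroid(vectors);
        for (double[] v : vectors) {
            Quantized q = quantize(v, c, 1);
            System.out.println(Arrays.toString(q.codes) + " in [" + q.lower + ", " + q.upper + "]");
        }
    }
}
```

Because the interval travels with each vector, the same routine works for other bit widths by changing only the `bits` argument.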

This allows for an almost 32x reduction in memory, and even better recall than before at the cost of slightly increasing indexing time.
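The "almost 32x" figure follows from storing 1 bit per dimension instead of a 32-bit float, minus a small per-vector overhead for values such as the interval bounds. The arithmetic can be sketched as below; the 14-byte overhead used in `main` is an assumed illustrative value, not a number stated in this PR, and the names are hypothetical.

```java
/** Illustrative arithmetic only; the per-vector overhead is an assumed parameter. */
final class MemoryReductionSketch {

    /** Bytes for a raw float32 vector. */
    static long rawBytes(int dims) {
        return 4L * dims;
    }

    /** Bytes for a 1-bit-per-dimension vector plus per-vector corrective values. */
    static long quantizedBytes(int dims, int correctionBytes) {
        return (dims + 7) / 8 + correctionBytes;
    }

    /** Raw size divided by quantized size. */
    static double reduction(int dims, int correctionBytes) {
        return (double) rawBytes(dims) / quantizedBytes(dims, correctionBytes);
    }

    public static void main(String[] args) {
        // Assumption: 14 bytes of per-vector corrections, chosen for illustration.
        System.out.println(reduction(1024, 14));
    }
}
```

With no overhead the ratio is exactly 32; with the assumed 14 bytes at 1024 dimensions it is roughly 28.8, and it approaches 32 as the dimension count grows.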

Additionally, this new approach is easily generalizable to various other bit sizes (e.g. 2 bits, etc.). While not taken advantage of yet, we may update our scalar quantized indices in the future to use this new algorithm, giving significant boosts in recall.

Compared with current BBQ, the recall gains range from 2% to almost 10% for certain datasets, with an additional 5-10% indexing cost when indexing with HNSW.

benwtrent added 6 commits

December 2, 2024 13:22


          Adding optimized scalar quantized binarization

ef6b4bb


          Merge branch 'main' into feature/even-better-binary-quantization

43cfc73


          Moving towards using optimized scalar quantization for binary quantizer

f99b705


          adding more testing

33e6003


          adding testing

d86f659


          moving more code

bd64dd3

benwtrent added >enhancement auto-backport :Search Relevance/Vectors v9.0.0 v8.18.0 labels

elasticsearchmachine added the Team:Search Relevance label

Collaborator

elasticsearchmachine commented Dec 4, 2024

Pinging @elastic/es-search-relevance (Team:Search Relevance)


          Update docs/changelog/117994.yaml

370c64b

Collaborator

elasticsearchmachine commented Dec 4, 2024

Hi @benwtrent, I've created a changelog YAML for you.

benwtrent commented

.../src/main/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryFlatVectorsScorer.java

                   public RandomVectorScorerSupplier getRandomVectorScorerSupplier(
                       VectorSimilarityFunction similarityFunction,
                       KnnVectorValues vectorValues
-                  ) throws IOException {

Member Author

benwtrent Dec 4, 2024

removing write path for old codec

.../src/main/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryFlatVectorsScorer.java

Comment on lines -93 to -100

-                  RandomVectorScorerSupplier getRandomVectorScorerSupplier(
-                      VectorSimilarityFunction similarityFunction,
-                      ES816BinaryQuantizedVectorsWriter.OffHeapBinarizedQueryVectorValues scoringVectors,
-                      BinarizedByteVectorValues targetVectors
-                  ) {
-                      return new BinarizedRandomVectorScorerSupplier(scoringVectors, targetVectors, similarityFunction);
-                  }

Member Author

benwtrent Dec 4, 2024

removing write path for old codec

.../src/main/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryFlatVectorsScorer.java

                   @Override
                   public String toString() {
                       return "ES816BinaryFlatVectorsScorer(nonQuantizedDelegate=" + nonQuantizedDelegate + ")";
                   }
-                  /** Vector scorer supplier over binarized vector values */

Member Author

benwtrent Dec 4, 2024

removing write path for old codec

...main/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryQuantizedVectorsFormat.java

    
-                     return new ES816BinaryQuantizedVectorsWriter(scorer, rawVectorFormat.fieldsWriter(state), state);
+                     throw new UnsupportedOperationException();

Member Author

benwtrent Dec 4, 2024

removing write path for old codec

.../java/org/elasticsearch/index/codec/vectors/es816/ES816HnswBinaryQuantizedVectorsFormat.java

@@ -25,10 +25,8 @@
               import org.apache.lucene.codecs.hnsw.FlatVectorsFormat;
               import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat;
               import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsReader;
-              import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsWriter;

Member Author

benwtrent Dec 4, 2024

removing write path for old codec

...st/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryQuantizedRWVectorsFormat.java

+              /**
+               * Copied from Lucene, replace with Lucene's implementation sometime after Lucene 10
+               */
+              public class ES816BinaryQuantizedRWVectorsFormat extends ES816BinaryQuantizedVectorsFormat {

Member Author

benwtrent Dec 4, 2024

For old codec backward compat testing

...java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryQuantizedVectorsFormatTests.java

@@ -63,7 +63,7 @@ protected Codec getCodec() {
                       return new Lucene100Codec() {
                           @Override
                           public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
-                              return new ES816BinaryQuantizedVectorsFormat();
+                              return new ES816BinaryQuantizedRWVectorsFormat();

Member Author

benwtrent Dec 4, 2024

For old codec backward compat testing

...test/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryQuantizedVectorsWriter.java

                   private final List<FieldWriter> fields = new ArrayList<>();
                   private final IndexOutput meta, binarizedVectorData;
                   private final FlatVectorsWriter rawVectorDelegate;
-                  private final ES816BinaryFlatVectorsScorer vectorsScorer;
+                  private final ES816BinaryFlatRWVectorsScorer vectorsScorer;

Member Author

benwtrent Dec 4, 2024

For old codec backward compat testing

...ava/org/elasticsearch/index/codec/vectors/es816/ES816HnswBinaryQuantizedRWVectorsFormat.java

+              import static org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat.DEFAULT_MAX_CONN;
+              import static org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat.DEFAULT_NUM_MERGE_WORKER;
+              class ES816HnswBinaryQuantizedRWVectorsFormat extends ES816HnswBinaryQuantizedVectorsFormat {

Member Author

benwtrent Dec 4, 2024

For old codec backward compat testing

.../org/elasticsearch/index/codec/vectors/es816/ES816HnswBinaryQuantizedVectorsFormatTests.java

@@ -59,7 +59,7 @@ protected Codec getCodec() {
                       return new Lucene100Codec() {
                           @Override
                           public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
-                              return new ES816HnswBinaryQuantizedVectorsFormat();
+                              return new ES816HnswBinaryQuantizedRWVectorsFormat();

Member Author

benwtrent Dec 4, 2024

For old codec backward compat testing

benwtrent added 3 commits

December 4, 2024 13:06


          Merge remote-tracking branch 'upstream/main' into feature/even-better-binary-quantization

afa2bba


          fixing fmt and headers

368602a


          Merge branch 'feature/even-better-binary-quantization' of github.com:benwtrent/elasticsearch into feature/even-better-binary-quantization

3d20c5b

benwtrent requested review from mayya-sharipova, tveasey and john-wagster

December 4, 2024 19:22

john-wagster approved these changes

Contributor

john-wagster left a comment

LGTM

pmpailis reviewed

server/src/main/java/org/elasticsearch/index/codec/vectors/es818/OptimizedScalarQuantizer.java

+                      double xe = 0.0;
+                      double e = 0.0;
+                      for (double xi : vector) {
+                          double xiq = (a + step * Math.round((clamp(xi, a, b) - a) * stepInv));

Contributor

pmpailis Dec 6, 2024

yeap, this seems about right 😅

Member Author

benwtrent Dec 6, 2024

LOL, I should indeed add some comments here. We are basically calculating the error of quantizing and then unquantizing and shifting up or down based on the mis-calculation.
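To make the round trip in that explanation concrete, here is a minimal sketch that quantizes each component onto a grid over `[a, b]`, maps it back, and sums the squared difference. It reuses the `xiq` expression from the excerpt above, but everything else (the class name, returning only the squared error, the `bits` parameter) is an assumption for illustration and is not the loss function from the PR.

```java
/** Illustrative only: measures the error of quantizing and then de-quantizing a vector. */
final class RoundTripErrorSketch {

    static double clamp(double x, double lo, double hi) {
        return Math.max(lo, Math.min(hi, x));
    }

    /** Sum of squared differences between each value and its quantized-then-restored value. Requires a < b. */
    static double squaredError(double[] vector, double a, double b, int bits) {
        int points = 1 << bits;
        double step = (b - a) / (points - 1);
        double stepInv = 1.0 / step;
        double e = 0.0;
        for (double xi : vector) {
            // Snap xi to the nearest grid point in [a, b], then map it back to a real value.
            double xiq = a + step * Math.round((clamp(xi, a, b) - a) * stepInv);
            e += (xi - xiq) * (xi - xiq);
        }
        return e;
    }

    public static void main(String[] args) {
        double[] v = {-1.0, 1.0};
        System.out.println(squaredError(v, -1.0, 1.0, 1)); // tight interval
        System.out.println(squaredError(v, -2.0, 2.0, 1)); // loose interval
    }
}
```

An optimizer can compare candidate intervals by this kind of error and move `a` and `b` up or down toward the one that reconstructs the vector best.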


          Merge remote-tracking branch 'upstream/main' into feature/even-better-binary-quantization

618c2c3

pmpailis reviewed

Contributor

pmpailis left a comment

Reviewed tests and the main set of changes against the 816* versions and they LGTM. The math changes are a bit more difficult to review; but will give it a more thorough go on Monday 😅 (no need to wait though; please feel free to proceed w/o waiting for my review on that part)


          java docs

d419964

Contributor

mayya-sharipova commented Dec 6, 2024

@benwtrent and Tom V. amazing work! It would be nice to add some documentation to the format: it looks like the queries are still 4 bits quantized?

Member Author

benwtrent commented Dec 6, 2024

> It would be nice to add some documentation to the format: it looks like the queries are still 4 bits quantized?

The on disk format is very similar. I can add some docs on that to the format
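For readers unfamiliar with scoring a 4-bit query against 1-bit stored vectors, one common way to compute the integer dot product is to split the query codes into four bit planes and combine AND with popcount. The sketch below assumes that setup purely for illustration; the layout and all names are hypothetical and are not taken from this PR or from the on-disk format.

```java
/** Illustrative only: integer dot product of 4-bit query codes with 1-bit document codes. */
final class AsymmetricDotSketch {

    /** Packs 0/1 values into 64-bit words; value i goes to word i / 64, bit i % 64. */
    static long[] packBits(int[] bits) {
        long[] words = new long[(bits.length + 63) / 64];
        for (int i = 0; i < bits.length; i++) {
            if (bits[i] != 0) {
                words[i >> 6] |= 1L << (i & 63);
            }
        }
        return words;
    }

    /** Splits 4-bit codes (0..15) into four packed bit planes, least significant first. */
    static long[][] bitPlanes(int[] codes) {
        long[][] planes = new long[4][];
        for (int p = 0; p < 4; p++) {
            int[] plane = new int[codes.length];
            for (int i = 0; i < codes.length; i++) {
                plane[i] = (codes[i] >> p) & 1;
            }
            planes[p] = packBits(plane);
        }
        return planes;
    }

    /** Sum over i of query[i] * doc[i], computed as the sum over planes of 2^p * popcount(plane AND doc). */
    static long dot(long[][] queryPlanes, long[] docBits) {
        long sum = 0;
        for (int p = 0; p < 4; p++) {
            long count = 0;
            for (int w = 0; w < docBits.length; w++) {
                count += Long.bitCount(queryPlanes[p][w] & docBits[w]);
            }
            sum += count << p;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[][] query = bitPlanes(new int[] {15, 3, 8, 0});
        long[] doc = packBits(new int[] {1, 1, 0, 1});
        System.out.println(dot(query, doc)); // 15 + 3 = 18
    }
}
```

The raw integer result would still need per-vector corrections (interval bounds and similar) to approximate the original similarity; those are out of scope for this sketch.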

mayya-sharipova approved these changes

Contributor

mayya-sharipova left a comment

@benwtrent Thanks Ben! Great work!

Similar to Panos, I reviewed the file formats; I don't follow all the math in the quantizer, but I trust you and Tom got it right.

benwtrent added 2 commits

December 9, 2024 09:50


          Merge remote-tracking branch 'upstream/main' into feature/even-better-binary-quantization

1c44537


          adding some format docs

ef147d8

benwtrent added the auto-merge-without-approval label

elasticsearchmachine merged commit 5e859d9 into elastic:main

16 checks passed

benwtrent deleted the feature/even-better-binary-quantization branch

December 9, 2024 16:06

elasticsearchmachine added the backport pending label

Collaborator

elasticsearchmachine commented Dec 9, 2024

💔 Backport failed

Status	Branch	Result
❌	8.x	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 117994

benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request


          Even better(er) binary quantization (elastic#117994)

f5dfe16

benwtrent mentioned this pull request

[8.x] Even better(er) binary quantization (#117994) #118295

Merged

elasticsearchmachine pushed a commit that referenced this pull request


          [8.x] Even better(er) binary quantization (#117994) (#118295)

ffc5978

* Even better(er) binary quantization (#117994)

* fixing backport

* fixing test

Labels

auto-backport auto-merge-without-approval backport pending >enhancement :Search Relevance/Vectors Team:Search Relevance v8.18.0 v9.0.0